Identification of unobservable behavior in stochastic discrete event systems with a low number of sensors

Discrete event systems (DES) are dynamic systems that evolve through the asynchronous occurrence of discrete events. Their versatility has made them a critical modeling tool in many applications. Finding models that define the behavior of a DES is a topic that has been addressed from different approaches, depending on the type of system to be modeled and the model's objective. This article focuses on the identification of timed models for stochastic discrete event systems. The identified model includes both observable and unobservable behavior. The method proceeds through the following steps:
• Identifying the sequences of events observed at different time instants during the closed-loop operation of the system (observed language).
• Inferring the stochastic behavior of the time between events and modeling the observable behavior as a stochastic timed Interpreted Petri Net (st-IPN).
• Inferring the non-observable behavior using the language projection operation between the observed language and the language generated by the st-IPN.
The method is novel in that it uses timed events, can be applied to systems with a low number of sensors, and can infer unobservable behavior for any sequence of events.


Introduction
A DES model can be developed from an expert's knowledge of the system, by applying refinement techniques (top-down design) or modular composition (bottom-up design), or by means of identification methods. Identification methods start from the observed input-output signals of the system, which generate sequences of events that define the system's behavior [1]. The set of sequences forms the language [2]; when the time of the events is also considered, it is called a timed language. Identification algorithms take the observed language as input and produce a model representing that language as output [3, 4]; timed identification algorithms additionally infer the stochastic behavior of the transitions.
The observed language can be found by measuring the changes in the actuator signals, such as those of switches and valves, or in the sensor signals, which change according to specific actions. The representation of the observed language as a DES is applied in the modeling of industrial processes, production systems, robotics, manufacturing processes, traffic systems, biological and chemical processes [5], and telecommunication networks [6], among others. When these processes have a large number of sensors and actuators, the observable behavior is generally sufficient to produce a model close to the real behavior of the system; however, when the system has few of these sensing and control elements, the observed language cannot record all the changes that occur during the process. This issue gives rise to unobservable behavior [7] that is not easily identifiable.
The identification of the unobservable behavior of DES represented as PNs has been studied only to a limited extent. This unobservable behavior may be due to unobservable states or unobservable transitions [8][9][10] in a closed-loop DES. The focus of this paper is the identification of unobservable states. In the literature on this approach, the proposed methods identify the observable behavior as partial Petri nets, and the model is completed by adding unobservable places. The unobservable behavior is added from the analysis of the observable IPN structure, or in combination with knowledge models, i.e., models generated by experts [10, 11]. For example, in [11], the model identified from the signal sequences is represented as a Signal Interpreted Petri net (SIPN); then, from the sequence represented in the SIPN, an argued reachability graph is constructed, on which the remainder of the method is based. In [12], a method is proposed for discovering sequential and concurrent relationships between events in the observed sequence: when two transitions are observed consecutively and one is systematically preceded by the other, there is a sequential relationship, and when two transitions have been observed consecutively in both orders, there is a concurrent relationship; the unobservable places represent the internal behavior of the system and are added so that the final model represents the sequence of events. In [13], the identified events that do not match the sequence of events can be considered unobservable events and are added at the end to the identified model through the generation of T-invariants. In [14], the observable behavior of a closed-loop plant-controller system is modeled from the exchange of physical signals between the plant and the controller; events such as the expiration of a timer are assumed to be internal events whose occurrence generates the unobservable behavior; the observable behavior is identified as IPN fragments, and unobservable places are added so that the model represents the observed event sequences.
Another approach is based on language theory: in [15], unobservable behavior is discovered by projecting the firing sequence obtained in the first step onto subalphabets and finding specific patterns that are characteristic of the dependency relationships between the transition firings. Unlike the previous approaches, in [7, 16], once the sequence of events has been observed, a language reduction is first performed from a synthesis approach against a design parameter, the word size r; then, a language superior to that of the identified net is defined from the observed marking reached during the firing of the sequence of transitions; finally, an optimization approach based on integer linear programming is used to find the unobservable behavior.
In general, these approaches infer unobservable behavior by discovering dependence relationships in the observed sequences, by comparing them with knowledge models, or by applying optimization techniques. However, they do not report specific instrumentation conditions of the automated system to be identified. In the present work, we propose to infer the unobservable behavior without using the observable IPN structure. The main novelty is that the proposed method can be applied to a reactive DES, that is, a closed-loop system consisting of a plant and a controller exchanging signals, where the plant has a low number of sensors. Moreover, to identify the observable and unobservable behavior, the input information comprises not only the sequences of changes in the I/O signals but also the times at which these changes occur. Based on these characteristics, the method can be applied to different industrial automation systems, such as communication systems, transport systems, systems where timing information is critical, or production systems with low levels of sensorization.
The article is organized as follows: section 2 presents the conceptual basis of the proposal, section 3 describes the method for identifying unobservable behavior, section 5 presents the method validation and discussion, and finally, section 6 presents the conclusions of the work.

Basic concepts
Modeling the behavior of a discrete event system (DES) is essential for conducting functional analysis studies, carrying out performance evaluations, and applying simulation techniques, among other applications. The dynamics of these systems are represented based on the asynchronous occurrence of discrete events; some of these events are controlled, whereas others are not, and some of these events are observed by sensors, whereas others are not [2]. A reactive DES is a closed-loop system consisting of a plant and a controller exchanging signals. The most commonly used representation formalisms for DESs are automata and Petri nets (PNs). PNs offer advantages for modeling DES because they can represent concurrent, asynchronous, distributed, parallel, nondeterministic, and/or stochastic behavior. This work is based specifically on two types of PNs: Interpreted Petri nets (IPN) and timed Petri nets (t-PN) [17].
The following are the basic concepts and basic notations used in the article.

Definition 1. Petri Nets (PN)
PN are models capable of describing the total information flow of a system with concurrent and distributed processes, and they are widely used to model DESs. PNs provide compact models and capture important characteristics of DESs, such as concurrency, synchronism, causal relationships, and shared resources.
The PN structure is a bipartite digraph represented by the five-tuple $N = (P, T, Pre, Post, M_0)$, where $P$ is a set of places with cardinality $|P|$, $T$ is a set of transitions with cardinality $|T|$, and $Pre: P \times T \to \mathbb{N}$, $Post: P \times T \to \mathbb{N}$ are the pre- and post-incidence matrices, respectively, which specify the arcs connecting the places and transitions. Matrix $C = Post - Pre$ is the $|P| \times |T|$ incidence matrix of the net. The marking function $M: P \to \mathbb{N}$ represents the number of tokens residing in each place, and $M_0$ is the initial marking [18].
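As an illustration of this structure, the following minimal sketch builds the incidence matrix of a small net and fires its transitions. The three-place, two-transition net, the names `pre`, `post`, `incidence`, and `m0`, and the helper functions are assumptions made for the example; they are not taken from the article.

```python
import numpy as np

# Hypothetical net (illustration only): rows index places, columns index transitions.
pre = np.array([[1, 0],
                [0, 1],
                [0, 0]])
post = np.array([[0, 0],
                 [1, 0],
                 [0, 1]])
incidence = post - pre       # incidence matrix: Post - Pre
m0 = np.array([1, 0, 0])     # initial marking: one token in the first place


def is_enabled(marking, t):
    """A transition is enabled when every input place holds enough tokens."""
    return bool(np.all(marking >= pre[:, t]))


def fire(marking, t):
    """Return the marking reached by firing transition t."""
    if not is_enabled(marking, t):
        raise ValueError(f"transition {t} is not enabled")
    return marking + incidence[:, t]


m1 = fire(m0, 0)   # token moves from the first place to the second
m2 = fire(m1, 1)   # token moves from the second place to the third
```

Firing a transition adds the corresponding column of the incidence matrix to the current marking, which is why only the difference between the post- and pre-incidence matrices is needed to track the evolution of the net.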

Background
The present work is based on the modeling proposal presented by [1] ; some of the considerations of this proposal are the following:

Definition 3. Input -Output (I/O) Vector
For simplicity of representation, I/O vectors are encoded using their decimal representation; therefore, an input (output) symbol is defined as the set of control command values (sensor readings) at an instant of time, in decimal representation. An I/O symbol is thus defined as
$$(\sigma_i\,\phi_j) \qquad (1)$$
where $\sigma_i$ stands for the input symbol in decimal, with $i = 0, \ldots, 2^m - 1$; likewise, $\phi_j$ stands for the output symbol in decimal, with $j = 0, \ldots, 2^n - 1$; $m$ and $n$ are the numbers of inputs and outputs, respectively.
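The decimal encoding can be sketched as follows. The function names `encode` and `io_symbol` are hypothetical, and the bit order (most significant bit first) is an assumption, since the article does not state it here.

```python
def encode(bits):
    """Encode a binary signal vector as a decimal index.

    Assumption: the first element is the most significant bit.
    """
    value = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("signals must be binary")
        value = value * 2 + b
    return value


def io_symbol(inputs, outputs):
    """Pair the decimal input symbol with the decimal output symbol."""
    return (encode(inputs), encode(outputs))


symbol = io_symbol([1, 1], [1, 0])   # two inputs, two outputs
```

With two inputs and two outputs, each symbol index ranges from 0 to 3, which matches the alphabet size of two to the power of the number of signals.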

Definition 4. Interpreted Petri Net (IPN)
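In an IPN, $\Sigma = \{\sigma_0, \cdots, \sigma_{2^m-1}\}$ is the input alphabet, $\sigma_i$ is an input symbol, and $m$ is the number of inputs; $\Phi = \{\phi_0, \cdots, \phi_{2^n-1}\}$ is the output alphabet, $\phi_j$ is an output symbol, and $n$ is the number of outputs; $\lambda: T \to \Sigma$ is a transition labeling function that assigns an input symbol to each transition, and $\varphi: P \to \Phi$ is an output function that assigns an output symbol to each place.
The system alphabet $\Omega$ relates the I/O symbols; specifically, $\Omega = \Sigma \cdot \Phi$. Therefore, an event $e$ such that $e \in \Omega$ is of the form $e = (\sigma_i\,\phi_j)$, according to Eq. (1).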

Definition 5. Event
A new event, $e_k$, is generated when there is a change in the input symbol, the output symbol, or both. For the above example, given the readings observed at different time instants ($t_k$), the initial event is $e_0 = ([00], [00])$, i.e., $e_0 = (\sigma_0\,\phi_0)$, and the following events are represented as follows. There is an event from $t_0$ to $t_1$ because both the input and output signals change, $e_1 = (\sigma_2\,\phi_2)$; there is another event from $t_1$ to $t_2$ because the input signal changes but the output signal does not, $e_2 = (\sigma_3\,\phi_2)$; there is another event from $t_2$ to $t_3$ because the output signal changes but the input does not, $e_3 = (\sigma_3\,\phi_3)$; and there is another event from $t_3$ to $t_4$ because the input signals change and a change in an output signal is generated, $e_4 = (\sigma_0\,\phi_1)$.
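The event-generation rule can be sketched in a few lines. The function `detect_events` and the list of readings are hypothetical: the readings are chosen only so that the resulting events match the five events of the example above, with each pair holding the decimal input and output symbols.

```python
def detect_events(readings):
    """Return the events generated by a list of (input, output) readings.

    A new event is emitted only when the input symbol, the output symbol,
    or both differ from the previous event.
    """
    events = []
    for reading in readings:
        if not events or reading != events[-1]:
            events.append(reading)
    return events


# Hypothetical sampled readings; repeated samples generate no event.
readings = [(0, 0), (0, 0), (2, 2), (3, 2), (3, 2), (3, 3), (0, 1)]
events = detect_events(readings)
```

Repeated samples with unchanged signals are discarded, so the number of events depends on the signal changes and not on the sampling rate.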

Definition 6. Timed event
A timed event is an event $e_i$ together with its time $\tau_i$, where $\tau_i = |t_i - t_{i-1}|$ and $t_i$ is the time instant at which event $i$ occurred. When the DES is stochastic, the timed information is represented by the mean and standard deviation calculated from different operating cycles of the system.
A sequence $S_j$ is a concatenation of timed events organized in a timeline, with $j = 1, \cdots, q$, where $q$ is the number of observed sequences. The set of sequences constitutes the observed behavior of the system, or the observed language.

Definition 7. Stochastic timed Interpreted Petri Net (st-IPN)
An st-IPN is an IPN whose elements have the same meaning as in Definition 4, except that the input function is a labeling function that assigns both an input symbol and a density function to each transition; the set of density functions is isomorphic over the set of transitions.
The statistical behavior of the transitions is inferred from the time data observed during many cycles of operation of the system.

Definition 8. Language generated by an st-IPN
The language generated by an st-IPN is composed of all event sequences that evolve the st-IPN; that is, each sequence has the form $e_0 \cdots e_k$, where each $e_i$ is an event as defined in Eq. (3).

Definition 9. Input language
The input language consists of the sequences of the input symbols of a language. According to [3], it is obtained from the language by the projection operation, which retains the input symbol of each event.

Definition 10. Output language
The output language consists of the sequences of the output symbols of a language. According to [3], it is obtained from the language by the projection operation, which retains the output symbol of each event.
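A minimal sketch of the two projections is given below, assuming that each event is stored as an (input symbol, output symbol) pair and that a language is a set of event sequences. The function names are hypothetical, and the exact projection operator of [3] may differ in detail (for instance, in how timing information is carried).

```python
def input_language(sequence):
    """Project an event sequence onto its input symbols."""
    return tuple(inp for inp, _out in sequence)


def output_language(sequence):
    """Project an event sequence onto its output symbols."""
    return tuple(out for _inp, out in sequence)


def project_language(language, projector):
    """Apply a projection to every sequence of a language (a set of sequences)."""
    return {projector(seq) for seq in language}


sequence = ((0, 0), (2, 2), (3, 2), (3, 3), (0, 1))
inputs = input_language(sequence)
outputs = output_language(sequence)
```

Keeping the projections as separate functions makes it straightforward to apply the same operation to both the observed language and the language generated by the model.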

Problem statement
One way to obtain the behavioral model of a DES is through identification methods. The identified models have an observable and an unobservable behavior. The observable behavior is constructed from the system's input and output (I/O) signals. The unobservable behavior is inferred from the I/O signals; it arises when the internal dynamics of the system change without producing changes in the I/O signals. One cause is that the system is not entirely sensorized, an aspect that becomes important in industrial applications. The models obtained through identification serve different purposes, such as fault diagnosis and the validation and synthesis of controllers, so models that represent the system's dynamics as realistically as possible are required. Proposals to identify unobservable behavior under the Petri net formalism have been reported in the literature from different approaches. One approach compares the observable behavior with previous models given by experts and adds the missing places. Another approach conditions the observable PN to the fulfillment of T-invariants. A further approach, presented in [15], determines the unobservable part by projecting the firing sequences onto sub-alphabets to discover specific patterns that are characteristic of the dependency relationships between transitions. This approach has a weak point when no patterns can be found in the sequence of events, for example, with strings of events that occur only once in the system; in addition, for very long sequences, it presents a problem in choosing the parameter that sets the size of the languages.
Considering the above, this paper proposes a method to infer the unobservable behavior of stochastic DES with a low number of sensors from timed event sequences observed during many cycles of system operation. The method allows event sequences to be modeled regardless of the number of times they are repeated, and it builds on the proposal for identifying the observable behavior presented in [1].

Method details
The proposed method consists of two stages: the identification of the structure of an st-IPN (see Definition 7 ) from sequences of timed events observed as defined in Definition 6 , of a system as described in Definition 2 , and then the addition of the unobservable behavior to the structure of the identified st-IPN based on the projection of languages.
Unlike reference [1], in the present proposal the system is not divided into subsystems; furthermore, the identification algorithm works offline.

Fig. 1. Representation of an st-IPN.

Identification of the observed behavior
(1) Observe the operation of the system to be identified for several cycles, and record the input/output signals, together with the time of each change, in a table as shown in Table 1.

Event time definition
The times of the events in each sequence are stored in a matrix, where the number of columns is the number of cycles observed in the sequence, and each row contains the observed times of one event. Once the system has been observed for several cycles, the mean and standard deviation of the time of each event in each sequence can be determined.
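The computation can be sketched as follows. The time values are hypothetical, the function names are illustrative, and the use of the sample standard deviation (rather than the population one) is an assumption, since the article does not specify which estimator is used.

```python
import statistics


def inter_event_times(timestamps):
    """Time of each event as the absolute difference between consecutive instants."""
    return [abs(b - a) for a, b in zip(timestamps, timestamps[1:])]


def timing_statistics(time_matrix):
    """Return (mean, standard deviation) for each row of the time matrix.

    Rows are events and columns are observed cycles.
    Assumption: sample standard deviation; 0.0 when only one cycle exists.
    """
    stats = []
    for row in time_matrix:
        mean = statistics.mean(row)
        std = statistics.stdev(row) if len(row) > 1 else 0.0
        stats.append((mean, std))
    return stats


# Hypothetical data: two events observed over three cycles (seconds).
time_matrix = [
    [2.0, 2.2, 1.8],
    [5.0, 5.0, 5.0],
]
stats = timing_statistics(time_matrix)
taus = inter_event_times([0.0, 2.0, 7.0])
```

A row with zero deviation, such as the second one here, corresponds to an event whose time is effectively deterministic over the observed cycles.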

Modeling of observable behavior
Each of the identified sequences is represented in an st-IPN. The input signals are associated with the transitions as labels, together with the mean and standard deviation of the time, and the output signals are the labels of the places (Fig. 1).
The identification of the observable st-IPN is made based on Algorithm 1 .

Elimination of transition self-loops
Once the st-IPN is available, a net reduction process is performed based on the elimination of transition self-loops. Specifically, a transition $t$ is eliminated if its input and output places are the same ($\bullet t = t \bullet$). Additionally, a transition is eliminated if its associated time is deterministic and there is another transition with the same input signals but a different output place. These transitions are assumed to be associated with an event generated by the settling time in the response of the sensors. Fig. 2 presents the structure of an st-IPN in which transition $t_1$ is eliminated.
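The two elimination rules can be sketched as follows. The `Transition` record, the function names, and the example net are hypothetical, and treating a time as deterministic when its standard deviation is zero is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    source: str        # input place
    target: str        # output place
    input_symbol: int  # decimal input symbol used as the label
    mean: float        # mean time
    std: float         # standard deviation of the time


def remove_self_loops(transitions):
    """Drop transitions whose input and output places coincide."""
    return [t for t in transitions if t.source != t.target]


def remove_settling_transitions(transitions):
    """Drop deterministic transitions (assumed: std == 0) that share their
    input symbol with another transition leading to a different output place."""
    kept = []
    for t in transitions:
        has_twin = any(
            o is not t
            and o.input_symbol == t.input_symbol
            and o.target != t.target
            for o in transitions
        )
        if t.std == 0.0 and has_twin:
            continue
        kept.append(t)
    return kept


# Hypothetical transitions for illustration.
net = [
    Transition("t1", "p1", "p1", 2, 1.0, 0.1),  # self-loop
    Transition("t2", "p1", "p2", 3, 2.0, 0.2),
    Transition("t3", "p2", "p3", 5, 0.5, 0.0),  # deterministic, shares input 5
    Transition("t4", "p2", "p4", 5, 4.0, 0.3),
]
reduced = remove_settling_transitions(remove_self_loops(net))
```

In this example, `t1` is removed by the first rule and `t3` by the second, leaving `t2` and `t4`.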

Modeling of unobservable behavior
The present proposal for modeling unobservable behavior is based on analyzing the generated language and the input and output languages of the st-IPN obtained in the previous step. The following procedure is proposed to model unobservable behavior.
(2) Find the input and output languages of the observed and st-IPN-generated languages, based on Eqs. (7) and (8). The input language therefore has the form $(\sigma_0(\mu_0, s_0) \cdots \sigma_k(\mu_k, s_k))$, where each input symbol $\sigma_i$ is accompanied by the mean $\mu_i$ and standard deviation $s_i$ of its time, and the output language has the form $(\phi_0 \cdots \phi_k)$.
(3) Identify event sub-sequences where unobservable places may be present. If a sub-sequence of timed events has an output language composed of identical output symbols while the input symbols of its input language are not identical, there may be unobservable places between the transitions involved in the sub-sequence. This phenomenon occurs because, owing to the low number of sensors, it is not possible to model the states to which the system has evolved from the input signals.
(4) Infer unobservable places from the observable behavior. Unobservable places are inferred between two transitions if, in the sub-sequence identified in the previous step, two or more consecutive transitions with different input symbols are generated starting from the same place.
(5) Add the unobservable places to the previously identified st-IPN.
As can be seen, the language generated by the st-IPN represents behaviors that are not identified in the system. Since the observed language of a DES defines its behavior, it is a primary condition that the model generates the same language. When a DES is not sufficiently sensorized, it is common to identify models that do not generate the same language, but including unobservable places can solve this issue. For this purpose, the input and output languages of the observed and generated languages are analyzed to establish between which transitions the unobservable places should be included.
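Step (3) of the procedure above can be sketched as a scan over one event sequence. The function `candidate_subsequences` is hypothetical, events are assumed to be stored as (input symbol, output symbol) pairs, and timing information is omitted for brevity.

```python
def candidate_subsequences(sequence):
    """Return (start, end) index pairs, inclusive, of maximal runs in which
    the output symbol stays the same while the input symbols differ.

    Such runs are candidates for unobservable places between the
    transitions involved.
    """
    candidates = []
    start = 0
    for i in range(1, len(sequence) + 1):
        run_ended = i == len(sequence) or sequence[i][1] != sequence[start][1]
        if run_ended:
            inputs = {inp for inp, _out in sequence[start:i]}
            if len(inputs) > 1:
                candidates.append((start, i - 1))
            start = i
    return candidates


# Hypothetical sequence: the output stays at 2 while the input takes 2, 3, 1.
sequence = [(0, 0), (2, 2), (3, 2), (1, 2), (0, 1)]
candidates = candidate_subsequences(sequence)
```

Each returned index pair marks a stretch of the sequence where the controller issued different commands without any sensor reading changing, which is where the procedure would place unobservable places.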

Contribution
The representation formalism used to model behavior is based on an st-IPN, which allows modeling changes in output signals that depend on changes in input signals by assigning labels to transitions and places, while also capturing the stochastic behavior of events. This feature has advantages over proposals that model only the behavior of signals through IPNs [15, 19] or only the timed behavior [3].
In addition, the inference of unobservable behavior based on the analysis of the timed sequences of input events allows the addition of places whose dynamics is related to the evolution of the internal state of the plant caused by low plant sensing or by timed responses, without incurring computational costs that are generated when optimization techniques or invariant calculations are applied.

Method validation
This section presents the results of applying the proposed method to a case study. The case study is an industrial application whose behavior evolves through states and for which a behavioral model as a DES is required. Its objective is to glue a soft pad onto an armrest. The line has a conveyor, a glue injector with a sliding movement, a pad chute with an inclined belt, a pad dispenser and placer, a pad press station with a vibrator, cylinders along the line that act as barriers and traffic control, an automatic wrapper station, and an HMI screen where the parameterization, control, and supervision of the entire system are performed. The plant under study has stochastic behavior and is instrumented with six sensors and 16 actuators. It is desired to find the model that represents the observable and unobservable behavior as a DES. For this purpose, the method presented in this article is used.
When observing the operation of the plant from the initial event ( 9 , 0 ) , it was found that its behavior is defined by three sequences of events, which are shown in Table 2 .
Applying Algorithm 1 to the observed sequences, reported in Table 2 , the resulting st-IPN is shown in Fig. 5 , representing the observable behavior. The st-IPN with observable and unobservable behavior is shown in Fig. 6 .

Discussion
By applying the method to identify observable behavior, the resulting model represents a dynamic system that reacts to the input signals by changing its outputs. Therefore, when the model shows no state change in response to a sequence of input events related to the control commands, the proposed method infers the unobservable behavior by adding places to the observable model. In the identified PNs, shown in Figs. 4 and 6, the inclusion of this type of place made it possible to eliminate sequences that did not belong to the system behavior, making the generated language equal to the observed language. Therefore, this model helps to represent system behaviors that could be closer to the real behavior of the system. This aspect is important when the resulting models are used to verify system performance, for example when validating control strategies or performing fault diagnosis. In addition, this proposal allows for the modeling of causal and concurrent events. Causal events can be modeled because the observed sequences are generated in order as the system is observed; in turn, the sequences are cyclic, and their representation in the st-IPN guarantees their order. Moreover, two events can be triggered simultaneously when more than one sequence is modeled. This situation can be observed in Fig. 6: when place p6 is active, either transition 19 or transition 20 can be triggered. In the introductory example, another sequence of events can be used.
The st-IPN for this sequence is shown in Fig. 7. With this variation, it is observed that when the place 2 is active, the transitions

Conclusions
This paper presents an approach to solve the problem of inferring the unobservable behavior of a stochastic discrete event system. The method starts from the observation of the system and the generation of timed input-output events that represent the signals that are exchanged in a closed-loop DES. This method is applicable in DESs that react to the execution of input signals coming from the controller, with changes in the output signals. The method has two components: first, it identifies the structure of an st-IPN, the input and output functions associated with the PN and the temporal behavior of the transitions, and second, it infers the unobservable behavior based on language theory, namely by comparing the observed language with the one generated by the st-IPN and adding places for these languages to be the same.
In addition, the output function of the proposed method does not assign the activation of a single sensor reading to each place, but rather the combination of the activations of different readings. This property is useful for improving the event-detectability of the resulting model. To check event-detectability, the product of the incidence matrix and a matrix that relates the sensor readings to the identified places must be computed. If the resulting matrix has linearly independent, non-zero columns, the system modeled by the st-IPN is event-detectable, i.e., a unique identifier is associated with the firing of each transition.
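The check described above can be sketched numerically. The function name, the matrix names `phi` and `incidence`, the orientation of the matrices (sensor readings by places, and places by transitions), and the example values are all assumptions made for the illustration.

```python
import numpy as np


def is_event_detectable(phi, incidence):
    """Check that every column of phi @ incidence is non-zero and that the
    columns are linearly independent.

    Assumptions: phi has one row per sensor reading and one column per place;
    incidence has one row per place and one column per transition.
    """
    product = phi @ incidence
    nonzero = bool(np.all(np.any(product != 0, axis=0)))
    independent = bool(np.linalg.matrix_rank(product) == product.shape[1])
    return nonzero and independent


# Hypothetical net with three places and two transitions.
incidence = np.array([[-1, 0],
                      [1, -1],
                      [0, 1]])
phi_full = np.eye(3, dtype=int)      # one reading per place
phi_poor = np.array([[1, 1, 0]])     # a single reading shared by two places

full = is_event_detectable(phi_full, incidence)
poor = is_event_detectable(phi_poor, incidence)
```

With one reading per place, each transition firing changes the readings in a distinct way; with the single shared reading, firing the first transition produces no change in the readings, so the check fails.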

Limitations
The method has limitations. First, it infers the observable and unobservable states in a DES represented as an st-IPN, but not the unobservable transitions. Second, the unobservable states are inferred from the reactive behavior of the input and output signals, namely when an input does not generate changes in the output; the method therefore cannot determine whether the unobservable behavior is due to some kind of failure rather than to a lack of sensors.

Future work
In future work, it is proposed to extend the method to detect and isolate faults in DES. A further study is proposed to compare the proposed method with other approaches, such as neural networks, for inferring unobservable states in systems with a low number of sensors.